MOB-NET-SSD: An Enhanced Real Time Object Identification Approach Based on Deep Learning

Authors: Shounak Bandyopadhyay, Sohini Banerjee, Avishek Gupta, Shayan Ghosh, Souvik Paul, Subhadip Das

DOI Link: https://doi.org/10.22214/ijraset.2025.72405

Abstract

Real-time object detection is a large, active, and complex area of computer vision concerned with identifying and recognizing objects. Using OpenCV (Open Source Computer Vision), a library of programming functions aimed mainly at real-time computer vision, object detection locates instances of semantic objects of a given class in digital images and videos. People with visual impairments often have difficulty recognizing objects in their environment, and helping them overcome this challenge is the primary goal of this work. Other applications of real-time object detection include object tracking, video surveillance, people counting, pedestrian detection, self-driving cars, face detection, and ball tracking in sports. Detection is performed with Convolutional Neural Networks, a type of deep learning technique. This article is intended as a helpful resource for those who are visually impaired.

Introduction

The project focuses on object detection in images and videos using deep learning, specifically Convolutional Neural Networks (CNNs). It employs the Mobile-Net SSD approach, which combines Mobile-Net, a lightweight CNN architecture optimized for mobile and embedded devices, with SSD (Single Shot MultiBox Detector), a fast and accurate detection framework. This combination enables efficient real-time detection of multiple object classes without manual feature extraction.
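To make the "lightweight" claim concrete, the sketch below compares the multiply-add cost of a standard convolution with that of a depthwise separable convolution, the building block Mobile-Net relies on. The function names and the example layer sizes are illustrative assumptions and are not taken from the paper; stride 1 and equal input and output feature map sizes are assumed.

```python
def standard_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-adds for a standard convolution.

    Assumes stride 1 and an output feature map of fmap x fmap.
    """
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def separable_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-adds for a depthwise separable convolution."""
    # Depthwise step: one kernel x kernel filter per input channel.
    depthwise = kernel * kernel * in_ch * fmap * fmap
    # Pointwise step: a 1x1 convolution that mixes channels.
    pointwise = in_ch * out_ch * fmap * fmap
    return depthwise + pointwise


def cost_ratio(kernel, in_ch, out_ch, fmap):
    """Separable cost divided by standard cost (1/out_ch + 1/kernel^2)."""
    return (separable_conv_cost(kernel, in_ch, out_ch, fmap)
            / standard_conv_cost(kernel, in_ch, out_ch, fmap))


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 32 -> 64 channels, 112x112 map.
    print(standard_conv_cost(3, 32, 64, 112))
    print(separable_conv_cost(3, 32, 64, 112))
    print(round(cost_ratio(3, 32, 64, 112), 4))
```

For this hypothetical layer the ratio is about 0.127, so the separable version needs roughly one eighth of the multiply-adds. The saving grows with the number of output channels and with the kernel size.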

The literature review highlights key object detection algorithms like YOLO (You Only Look Once) and Mobile-Net SSD, discussing their strengths in real-time applications such as autonomous vehicles, surveillance, robotics, and augmented reality. YOLOv3 and YOLOv4 Tiny offer improvements in speed and accuracy, while Mobile-Net SSD excels in resource-constrained environments.

The methodology describes CNN fundamentals, Mobile-Net’s architecture (using depthwise separable convolutions to reduce complexity), and SSD’s multi-scale detection using default bounding boxes ("priors"). The model is trained and evaluated on the MS COCO dataset, a large-scale annotated collection of images with diverse object categories.
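The multi-scale default boxes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the scale range (0.2 to 0.9), the aspect ratios, and the function names are assumptions, with the scale range following the commonly cited SSD defaults. The extra square box that SSD adds per location is omitted for brevity.

```python
import math


def prior_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced box scales, one per feature map (assumed range)."""
    if num_maps == 1:
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]


def priors_for_map(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Default boxes (cx, cy, w, h) in normalized [0, 1] coordinates."""
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Center each box on the middle of its feature map cell.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ratio in aspect_ratios:
                # Width and height keep area scale^2 for every ratio.
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                boxes.append((cx, cy, w, h))
    return boxes


if __name__ == "__main__":
    scales = prior_scales(6)
    # Hypothetical coarse 2x2 feature map using the largest scale.
    for box in priors_for_map(2, scales[-1]):
        print(tuple(round(v, 3) for v in box))
```

Fine feature maps with small scales cover small objects, while coarse maps with large scales cover large ones. The network then predicts class scores and coordinate offsets relative to each default box.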

Results demonstrate that after training for 50 epochs, the model achieved a validation accuracy of 90.3%, indicating strong performance for real-time object detection tasks.

Conclusion

In this study, we have presented MOB-NET-SSD, an enhanced real-time object identification approach that builds upon the strengths of Single Shot MultiBox Detector (SSD) and MobileNet, a lightweight deep neural network architecture. Our approach aims to provide an efficient and accurate solution for object detection in resource-constrained environments, such as mobile and embedded systems.

Future work will focus on further refining the model to enhance its robustness and extend its applicability. Potential directions include incorporating advanced techniques such as attention mechanisms to improve object detection in cluttered and dynamic environments, as well as exploring the integration of MOB-NET-SSD with other sensory data to create a more holistic perception system.

In conclusion, MOB-NET-SSD represents a significant advancement in real-time object detection technology, offering a balanced trade-off between speed and accuracy. Its ability to perform efficiently on low-power devices opens up new possibilities for deploying intelligent vision systems across various domains, paving the way for smarter and more responsive applications in the future.

Copyright

Copyright © 2025 Shounak Bandyopadhyay, Sohini Banerjee, Avishek Gupta, Shayan Ghosh, Souvik Paul, Subhadip Das. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET72405

Publish Date : 2025-06-10

ISSN : 2321-9653

Publisher Name : IJRASET
